Indexing without the Index: Scalable Multidimensional Aggregation for Data Warehouses
Authors
Abstract
Aggregation plays an important role in data warehousing and has received much attention to date. However, existing techniques do not sufficiently address the issue of making the computation of multidimensional aggregates scale with increasing dimensionality. At the same time, the benefit of taking full advantage of the hard disk geometry is often overlooked. This paper presents the Multiresolution File Scan (MFS) approach, which is based on a selection of flat files which are accessed with fast sequential I/O operations. Its simple structure and low storage overhead allow MFS to scale to high dimensionality while making the best use of the increasing transfer speed of modern hard disks. We show that MFS outperforms multidimensional index structures, even if these structures are bulk-loaded and hence optimized for query processing. Our approach can incorporate a priori knowledge about the query workload and is applicable to all distributive (e.g., COUNT, SUM, MAX, MIN) and algebraic (e.g., average) aggregate operators.
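The distinction between distributive and algebraic operators in the abstract can be illustrated with a small sketch. It is not the MFS algorithm from the paper; it only shows, under assumed names (`Partial`, `scan_chunk`, `merge_partials` are hypothetical), why COUNT, SUM, MIN and MAX can be computed per chunk of data and merged afterwards, and why an average can be derived from the merged SUM and COUNT.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Partial:
    """Partial aggregate state for one chunk of values (hypothetical helper)."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value: float) -> None:
        # Fold a single value into the distributive aggregates.
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other: "Partial") -> "Partial":
        # Distributive operators combine partial results directly.
        return Partial(
            count=self.count + other.count,
            total=self.total + other.total,
            minimum=min(self.minimum, other.minimum),
            maximum=max(self.maximum, other.maximum),
        )

    def average(self) -> Optional[float]:
        # Algebraic operator: derived from a fixed number of
        # distributive results (SUM and COUNT).
        return self.total / self.count if self.count else None


def scan_chunk(values: Iterable[float]) -> Partial:
    """Aggregate one chunk in a single sequential pass."""
    partial = Partial()
    for value in values:
        partial.add(value)
    return partial


def merge_partials(partials: Iterable[Partial]) -> Partial:
    """Combine per-chunk results into one overall result."""
    result = Partial()
    for partial in partials:
        result = result.merge(partial)
    return result


if __name__ == "__main__":
    chunks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
    overall = merge_partials(scan_chunk(chunk) for chunk in chunks)
    print(overall.count, overall.total, overall.average())
```

Because only a constant-size state is kept per chunk, each chunk can be read once from start to end, which is the kind of access pattern that sequential scans favor.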
Similar resources
Aggregation Algorithms for Very Large Compressed Data Warehouses
Many efficient algorithms to compute multidimensional aggregation and Cube for relational OLAP have been developed. However, to our knowledge, there is nothing to date in the literature on aggregation algorithms on compressed data warehouses for multidimensional OLAP. This paper presents a set of aggregation algorithms on very large compressed data warehouses for multidimensional OLAP. These al...
Efficient Aggregation Algorithms for Compressed Data Warehouses
Aggregation and cube are important operations for online analytical processing (OLAP). Many efficient algorithms to compute aggregation and cube for relational OLAP have been developed. Some work has been done on efficiently computing cube for multidimensional data warehouses that store data sets in multidimensional arrays rather than in tables. However, to our knowledge, there is nothing to d...
On the Requirements for User-Centric Spatial Data Warehousing and SOLAP
Data warehouses and OLAP systems help to analyze complex multidimensional data and provide decision support. With the availability of large amounts of spatial data in recent years, several new models have been proposed to enable the integration of spatial data in data warehouses and to help analyze such data. This is often achieved by a combination of GIS and spatial analysis tools with OLAP an...
Cost-based optimization of aggregation star queries on hierarchically clustered data warehouses
A methodology recently proposed to improve processing of star queries on data warehouses is the clustering and indexing of fact tables using their multidimensional hierarchies [DRSN98, MRB99, KS01]. Due to this improved organization schema, processing of aggregation star queries changes dramatically creating new optimization opportunities. An important optimization technique is the so-called pr...
Temporal Aggregation over Data Streams Using Multiple Granularities
Temporal aggregation is an important but costly operation for applications that maintain time-evolving data (data warehouses, temporal databases, etc.). In this paper we examine the problem of computing temporal aggregates over data streams. Such aggregates are maintained using multiple levels of temporal granularities: older data is aggregated using coarser granularities while more recent data...
Publication date: 2002